Zhiyuan Li (李志远)


Office: TTIC 508

I am a tenure-track assistant professor at the Toyota Technological Institute at Chicago (TTIC) and an affiliated faculty member of Computer Science at the University of Chicago. I am also a visiting faculty researcher at Google Research. Before joining TTIC, I was a postdoctoral fellow in the Computer Science Department at Stanford University, working with Tengyu Ma. I received my PhD from the Computer Science Department at Princeton University in 2022, where I was advised by Sanjeev Arora. I completed my undergraduate studies in the Yao Class at Tsinghua University.

I am broadly interested in machine learning theory, including optimization in deep learning, the reasoning capabilities of Large Language Models (LLMs), and modern paradigms of generalization in machine learning (overparametrization, out-of-domain generalization) and their connection to the implicit bias of optimization algorithms.

News

Jun 23, 2025 Excited to co-organize and participate in the Midwest Machine Learning Symposium at UChicago!
Jun 09, 2025 Presenting PENCIL at the 2025 Annual Meeting of the IDEAL Institute.
May 05, 2025 Serving as Area Chair for NeurIPS 2025.
May 01, 2025 4 papers accepted to ICML 2025!
Jan 31, 2025 Serving as Area Chair for ICML 2025.

Selected and Recent Publications

  1. Structured Preconditioners in Adaptive Optimization: A Unified Analysis
    Shuo Xie, Tianhao Wang, Sashank Reddi, Sanjiv Kumar, and Zhiyuan Li
    In Proceedings of the 42nd International Conference on Machine Learning, ICML 2025
  2. PENCIL: Long Thoughts with Short Memory
    Chenxiao Yang, Nathan Srebro, David McAllester, and Zhiyuan Li
    In Proceedings of the 42nd International Conference on Machine Learning, ICML 2025
  3. Weak-to-Strong Generalization Even in Random Feature Networks, Provably
    Marko Medvedev, Kaifeng Lyu, Dingli Yu, Sanjeev Arora, Zhiyuan Li, and Nathan Srebro
    In Proceedings of the 42nd International Conference on Machine Learning, ICML 2025
  4. Non-Asymptotic Length Generalization
    Thomas Chen, Tengyu Ma, and Zhiyuan Li
    In Proceedings of the 42nd International Conference on Machine Learning, ICML 2025
  5. Adam Exploits $\ell_\infty$-geometry of Loss Landscape via Coordinate-wise Adaptivity
    Shuo Xie, Mohamad Amin Mohamadi, and Zhiyuan Li
    In The Thirteenth International Conference on Learning Representations, ICLR 2025
  6. Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
    Zhiyuan Li, Hong Liu, Denny Zhou, and Tengyu Ma
    In The Twelfth International Conference on Learning Representations, ICLR 2024
  7. How Does Sharpness-Aware Minimization Minimize Sharpness?
    Kaiyue Wen, Tengyu Ma, and Zhiyuan Li
    In The Eleventh International Conference on Learning Representations, ICLR 2023
  8. What Happens after SGD Reaches Zero Loss?--A Mathematical Framework
    Zhiyuan Li, Tianhao Wang, and Sanjeev Arora
    In The Tenth International Conference on Learning Representations, ICLR 2022
  9. Why Are Convolutional Nets More Sample-Efficient Than Fully-Connected Nets?
    Zhiyuan Li, Yi Zhang, and Sanjeev Arora
    In The Ninth International Conference on Learning Representations, ICLR 2021